62 research outputs found

Editorial to special issue on energy efficient architectures for embedded systems

Author: Nunez-Yanez Jose
Roma Nuno
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/10/2016

Crossref

Explore Bristol Research

High Performance Multi-Standard Architecture for DCT Computation in H.264/AVC High Profile and HEVC Codecs

Author: Dias Tiago
Roma Nuno
Sousa Leonel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2013

This paper proposes a new high-performance architecture for computing all the DCT operations adopted in the H.264/AVC and HEVC standards. In contrast to other dedicated transform cores, the presented multi-standard transform architecture is built on a completely configurable, scalable and unified structure that can compute not only the forward and inverse 8×8 and 4×4 integer DCTs and the 4×4 and 2×2 Hadamard transforms defined in the H.264/AVC standard, but also the 4×4, 8×8, 16×16 and 32×32 integer transforms adopted in HEVC. Experimental results obtained using a Xilinx Virtex-7 FPGA demonstrated the superior performance and hardware efficiency of the proposed structure, which outperforms the most prominent related designs by at least 1.8 times. When integrated in a multi-core embedded system, this architecture allows the real-time computation of all the transforms mentioned above for resolutions as high as 8K Ultra High Definition Television (UHDTV) (7680×4320 @ 30 fps).
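For a rough sense of scale, the 8K UHDTV operating point quoted in this abstract can be turned into a sample rate. The sketch below is plain arithmetic, not taken from the paper; it counts luma samples only (chroma is ignored), and the function names are hypothetical.

```python
def luma_samples_per_second(width, height, fps):
    """Luma samples that must be processed each second at a given format."""
    return width * height * fps


def blocks_per_second(width, height, fps, block_size):
    """Number of square block_size x block_size luma blocks per second."""
    samples = luma_samples_per_second(width, height, fps)
    return samples // (block_size * block_size)


if __name__ == "__main__":
    # 8K UHDTV at 30 fps, as quoted in the abstract.
    print(luma_samples_per_second(7680, 4320, 30))
    # Equivalent number of 4x4 luma blocks per second.
    print(blocks_per_second(7680, 4320, 30, 4))
```

At 7680×4320 @ 30 fps this comes to 995,328,000 luma samples per second, or 62,208,000 4×4 luma blocks per second.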

Repositório Científico do Instituto Politécnico de Lisboa

Hardware/software co-design of H.264/AVC encoders for multi-core embedded systems

Author: Dias Tiago
Roma Nuno
Sousa Leonel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2010

This paper presents a multi-core H.264/AVC encoder suitable for implementation in embedded systems of small and medium complexity. The proposed structure results from an efficient hardware/software co-design methodology, in which the encoder software is highly optimized and structured in a modular manner, so that its most complex and time-consuming operations can be offloaded to dedicated hardware accelerators. The methodology adopts a simple and efficient core interconnection mechanism that makes it easy to add and remove such optimized processing cores. Experimental results obtained with a Virtex4 FPGA implementation of an H.264/AVC encoder, using an ASIP IP core as a motion estimation (ME) hardware accelerator, have proven the advantages of this methodology. For the considered system, speedup factors greater than 15 were obtained with a very modest increase in hardware resources.

Repositório Científico do Instituto Politécnico de Lisboa

Crossref

Efficient Parallel Video Encoding on Heterogeneous Systems

Author: Ilic Aleksandar
Momcilovic Svetislav
Roma Nuno
Sousa Leonel
Publication date: 01/01/2014

Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014. In this study we propose an efficient method for collaborative H.264/AVC inter-loop encoding in heterogeneous CPU+GPU systems. The method relies on a specifically developed, extensive library of highly optimized parallel algorithms for both CPU and GPU architectures, covering all inter-loop modules. To minimize the overall encoding time, the method integrates adaptive load balancing for the most computationally intensive inter-prediction modules, based on dynamically built functional performance models of the heterogeneous devices and inter-loop modules. The proposed method also introduces efficient communication-aware techniques, which maximize data reuse and decrease the overhead of expensive data transfers in collaborative video encoding. The experimental results show that the proposed method is capable of achieving real-time video encoding for very demanding video coding parameters, i.e., full HD video format, a 64×64 pixel search area and exhaustive motion estimation. This work was supported by national funds through FCT – Fundação para a Ciência e a Tecnologia, under projects PEst-OE/EEI/LA0021/2013, PTDC/EEI-ELC/3152/2012 and PTDC/EEA-ELC/117329/2010.
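The load-balancing idea in this abstract can be illustrated with a much simpler stand-in: splitting the rows of a frame between devices in proportion to their measured speed. This is only a sketch under stated assumptions, not the paper's functional performance models; the function name, the device labels and the speed figures are all hypothetical.

```python
def split_rows(total_rows, speeds):
    """Split total_rows among devices in proportion to measured speed.

    speeds maps a device name to a throughput estimate (e.g. rows per ms).
    Simple proportional split with largest-remainder rounding; an
    illustrative stand-in, not the method described in the paper.
    """
    total_speed = sum(speeds.values())
    # Ideal (fractional) share of rows for each device.
    shares = {dev: total_rows * s / total_speed for dev, s in speeds.items()}
    # Start from the integer part of every share.
    alloc = {dev: int(share) for dev, share in shares.items()}
    # Hand the remaining rows to the devices with the largest remainders.
    leftover = total_rows - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda d: shares[d] - alloc[d], reverse=True)
    for dev in by_remainder[:leftover]:
        alloc[dev] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical measurements: the GPU is three times faster than the CPU.
    print(split_rows(68, {"cpu": 1.0, "gpu": 3.0}))
```

In an adaptive scheme the speed estimates would be refreshed from the timings of previously encoded frames, so the split follows changes in the workload.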

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Scalable unified transform architecture for advanced video coding embedded systems

Author: Dias Tiago
Lopez Sebastian
Roma Nuno
Sousa Leonel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/10/2012

This paper presents a novel high-throughput and scalable unified architecture for computing the transform operations in video codecs for advanced standards. The structure can be used as a hardware accelerator in modern embedded systems to efficiently compute all the two-dimensional 4×4 and 2×2 transforms of the H.264/AVC standard. Moreover, its highly flexible design and hardware efficiency allow it to be easily scaled in performance and hardware cost to meet the specific requirements of any given video coding application. Experimental results obtained using a Xilinx Virtex-5 FPGA demonstrated the superior performance and hardware efficiency of the proposed structure, which presents a higher throughput per unit of area than other similar recently published designs targeting the H.264/AVC standard. The results also showed that, when integrated in a multi-core embedded system, this architecture provides speedup factors of about 120× over pure software implementations of the transform algorithms, therefore allowing the real-time computation of all the above-mentioned transforms for Ultra High Definition Video (UHDV) sequences (4,320 × 7,680 @ 30 fps).

Repositório Científico do Instituto Politécnico de Lisboa

Crossref

High throughput and scalable architecture for unified transform coding in embedded H.264/AVC video coding systems

Author: Dias Tiago
Lopez Sebastian
Roma Nuno
Sousa Leonel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2011

This paper presents an innovative high-throughput and scalable multi-transform architecture for H.264/AVC. The structure can be used as a hardware accelerator in modern embedded systems to efficiently compute the 4×4 forward/inverse integer DCT, as well as the 2-D 4×4 and 2×2 Hadamard transforms. Moreover, its highly flexible design and hardware efficiency allow it to be easily scaled in performance and hardware cost to meet the specific requirements of any given video coding application. Experimental results obtained using a Xilinx Virtex-4 FPGA demonstrate the superior performance and hardware efficiency of the proposed structure, which presents a throughput per unit of area at least 1.8× higher than other similar recently published designs. The results also showed that this architecture can compute, in real time, all the above-mentioned H.264/AVC transforms for video sequences with resolutions up to UHDV.
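As background for the 4×4 forward integer DCT named in this abstract, here is a minimal software sketch of the H.264/AVC core transform, Y = Cf · X · Cfᵀ. It is a plain reference computation, not the hardware architecture the paper proposes; the post-scaling that the standard folds into quantization is omitted, and the helper names are hypothetical.

```python
# Core matrix of the H.264/AVC 4x4 forward integer transform.
# Scaling is folded into quantization in the standard and is omitted here.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_transform_4x4(block):
    """Apply the unscaled core transform Y = CF * X * CF^T to a 4x4 block."""
    return matmul(matmul(CF, block), transpose(CF))


if __name__ == "__main__":
    flat = [[1] * 4 for _ in range(4)]
    # A constant block puts all of its energy in the DC coefficient.
    print(forward_transform_4x4(flat))
```

Because the matrix entries are limited to ±1 and ±2, the transform needs only additions, subtractions and shifts, which is what makes it attractive for compact hardware implementations.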

Repositório Científico do Instituto Politécnico de Lisboa

Crossref

Least squares motion estimation algorithm in the compressed DCT domain for H.26x/MPEG-x video sequences

Author: Leonel Sousa
Nuno Roma
Publication date: 01/01/2005

CiteSeerX

GPU Parallelization of HEVC In-Loop Filters

Author: Chi Chi Ching
de Souza Diego F.
Ilic Aleksandar
Juurlink Ben
Roma Nuno
Sousa Leonel
Wang Biao
Álvarez-Mesa Mauricio
Publication date: 01/01/2017

In the High Efficiency Video Coding (HEVC) standard, multiple decoding modules have been designed to take advantage of parallel processing. In particular, the HEVC in-loop filters (i.e., the deblocking filter and sample adaptive offset) were conceived to be exploited by parallel architectures. However, the type of parallelism offered mostly suits the capabilities of multi-core CPUs, which makes it a real challenge to efficiently exploit massively parallel architectures such as Graphics Processing Units (GPUs), mainly because of the data dependencies between the HEVC decoding procedures. Accordingly, this paper presents a novel strategy to increase the amount of parallelism and the resulting performance of the HEVC in-loop filters on GPU devices. For this purpose, the proposed algorithm performs the HEVC filtering at frame level and employs intrinsic GPU vector instructions. Compared to state-of-the-art HEVC in-loop filter implementations, the proposed approach also reduces the amount of required memory transfers, further boosting performance. Experimental results show that the proposed GPU in-loop filters deliver a significant improvement in decoding performance. For example, average frame rates of 76 frames per second (FPS) and 125 FPS for Ultra HD 4K are achieved on an embedded NVIDIA GPU for All Intra and Random Access configurations, respectively.

DepositOnce

Highly parallel HEVC decoding for heterogeneous systems with CPU and GPU

Author: Chi Chi Ching
de Souza Diego F.
Ilic Aleksandar
Juurlink Ben
Roma Nuno
Sousa Leonel
Wang Biao
Álvarez-Mesa Mauricio
Publication date: 01/01/2017

The High Efficiency Video Coding (HEVC) standard provides higher compression efficiency than other video coding standards, but at the cost of an increased computational load, which makes it hard to achieve real-time encoding/decoding for ultra high-resolution and high-quality video sequences. Graphics Processing Units (GPUs) are known to provide massive processing capability for highly parallel and regular computing kernels, but not all HEVC decoding procedures are suited to GPU execution. Furthermore, if HEVC decoding is accelerated by GPUs, energy efficiency is another concern for heterogeneous CPU+GPU decoding. In this paper, a highly parallel HEVC decoder for heterogeneous CPU+GPU systems is proposed. It exploits the available parallelism in HEVC decoding on the CPU, on the GPU, and between the CPU and GPU devices simultaneously. On top of that, different workload balancing schemes can be selected according to the CPU and GPU computing resources devoted to decoding. Furthermore, an energy-optimized solution is proposed by tuning GPU clock rates. Results show that the proposed decoder achieves better performance than the state-of-the-art CPU decoder, and that the best-performing workload balancing scheme depends on the available CPU and GPU computing resources. In particular, with an NVIDIA Titan X Maxwell GPU and an Intel Xeon E5-2699v3 CPU, the proposed decoder delivers 167 frames per second (fps) for Ultra HD 4K videos when four CPU cores are used. Compared to the state-of-the-art CPU decoder using four CPU cores, the proposed decoder gains a speedup factor of . When decoding performance is bounded by the CPU, a system-wise energy reduction of up to 36% is achieved by using fixed (and lower) GPU clocks, compared to the default dynamic clock settings on the GPU. EC/H2020/688759/EU/Low-Power Parallel Computing on GPUs 2/LPGPU

DepositOnce

A contação de histórias na reciclagem de óleo de dendê usado pelas baianas de acarajé, no eixo Itabuna-Ilhéus (Ba)/ Storytelling in the recycling of palm oil used by the Acarajé Bahian women in Itabuna-Ilhéus (Ba)

Author: Nascimento Nuno Avelar
Passos Christian Ricardo Silva
Silva Luhyris Nascimento Costa
Souza José Roberto Campos de
Tomaz Aleide Roma
Publication venue: Brazilian Journals Publicações de Periódicos e Editora Ltda.
Publication date: 16/10/2020

The purpose of this work was to qualitatively analyze the contribution of storytelling as a teaching resource in science education and environmental awareness. Stories serve as effective pointers for challenging situations and strengthen social, educational and emotional bonds. With this in mind, the objective of this work was to observe storytelling as a resource for science teaching, and thereby to involve our community in actions that protect the environment and promote social development. We conclude that the activities carried out in the classroom, with all participants, gave them a playful and enjoyable interaction with scientific content, playing an important role in forming critical and creative individuals and offering a broader range of views of society as a whole.

Brazilian Journals